Classification and Regression via Integer Optimization
Abstract
Motivated by the significant advances in integer optimization in the past decade, we introduce mixed-integer optimization methods to the classical statistical problems of classification and regression and construct a software package called CRIO (classification and regression via integer optimization). CRIO separates data points into different polyhedral regions. In classification each region is assigned a class, while in regression each region has its own distinct regression coefficients. Computational experiments with generated and real data sets show that CRIO is comparable to, and often outperforms, the current leading methods in classification and regression. We hope that these results illustrate the potential for significant impact of integer optimization methods on computational statistics and data mining.
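As a rough illustration of the idea in the abstract, the sketch below shows how a fitted model of this kind could be applied to a new point: each polyhedral region is a set of linear inequalities, and a point takes the class (or the regression coefficients) of the region that contains it. This is not the CRIO package or its mixed-integer formulation; the function names, the two regions, and all numbers are hypothetical and chosen only for the example.

```python
# Hypothetical sketch: prediction with polyhedral regions.
# Regions, labels, and coefficients below are made up for illustration;
# they are NOT produced by CRIO and do not reproduce its optimization model.
from typing import List, Optional, Sequence, Tuple

# A half-space a.x <= b is stored as (a, b); a region is a list of half-spaces.
HalfSpace = Tuple[Sequence[float], float]
Region = List[HalfSpace]


def in_region(x: Sequence[float], region: Region) -> bool:
    """Return True if x satisfies every inequality a.x <= b of the region."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) <= b for a, b in region)


def classify(x: Sequence[float],
             labeled: List[Tuple[Region, str]]) -> Optional[str]:
    """Return the class of the first region containing x, or None."""
    for region, label in labeled:
        if in_region(x, region):
            return label
    return None


def predict(x: Sequence[float],
            fitted: List[Tuple[Region, Sequence[float], float]]) -> Optional[float]:
    """Apply the (coefficients, intercept) of the first region containing x."""
    for region, beta, intercept in fitted:
        if in_region(x, region):
            return sum(bi * xi for bi, xi in zip(beta, x)) + intercept
    return None


# Two illustrative regions in the plane: x0 <= 0 and x0 >= 0 (written -x0 <= 0).
left: Region = [([1.0, 0.0], 0.0)]
right: Region = [([-1.0, 0.0], 0.0)]

labeled_regions = [(left, "A"), (right, "B")]
fitted_regions = [(left, [2.0, 0.0], 1.0), (right, [-1.0, 0.0], 3.0)]

label_left = classify([-1.0, 0.5], labeled_regions)
label_right = classify([2.0, 0.0], labeled_regions)
value_left = predict([-1.0, 0.5], fitted_regions)
value_right = predict([2.0, 0.0], fitted_regions)
```

In this toy setup a point on the shared boundary falls into whichever region is listed first; how regions are actually found and how ties are handled is determined by the paper's optimization models, which this sketch does not attempt to reproduce.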
Similar resources
Fuzzy Mixed Integer Optimization Model for Regression Approach
Mixed Integer Optimization has been a topic of active research in past decades. It has been used to solve statistical problems of classification and regression involving massive data. However, there is an inherent degree of vagueness present in huge real-life data. This imprecision is handled by fuzzy sets. In this paper, the Fuzzy Mixed Integer Optimization Method (FMIOM) is used to find solutio...
Feature subset selection for logistic regression via mixed integer optimization
This paper concerns a method of selecting a subset of features for a logistic regression model. Information criteria, such as the Akaike information criterion and Bayesian information criterion, are employed as a goodness-of-fit measure. The feature subset selection problem is formulated as a mixed integer linear optimization problem, which can be solved with standard mathematical optimization s...
Variable Selection via A Combination of the L0 and L1 Penalties
Variable selection is an important aspect of high-dimensional statistical modelling, particularly in regression and classification. In the regularization framework, various penalty functions are used to perform variable selection by putting relatively large penalties on small coefficients. The L1 penalty is a popular choice because of its convexity, but it produces biased estimates for the larg...
Spectral Energy Minimization for Semi-supervised Learning
The use of unlabeled data to aid classification is important, as labeled data is often available only in limited quantity. Instead of using training samples directly in semi-supervised learning, an energy function incorporating the conditional probability of classification is adopted. Semi-supervised learning is posed as the optimization of both the classification energy and the cluster compac...
Hyperspectral segmentation with active learning
This paper introduces a new supervised Bayesian approach to hyperspectral image segmentation, with two main steps: (a) learning, for each class label, the posterior probability distributions, based on a multinomial logistic regression model; (b) segmenting the hyperspectral image, based on the posterior probability distribution learnt in step (a) and on a multi-level logistic prior encoding the...
Journal title:
- Operations Research
Volume 55
Publication date: 2007